Using pymldb Tutorial

Interaction with MLDB occurs via a REST API. Calling a REST API over HTTP from a Notebook can be laborious if you use a general-purpose Python library like requests directly, so MLDB comes with a Python library called pymldb to ease the pain.

Connections

The pymldb library includes a class called Connection. The recommended usage pattern is shown here:


In [1]:
from pymldb import Connection
mldb = Connection("http://localhost")

Accessing the REST API

Once you have a connection object, you can easily make calls to the REST API:


In [2]:
mldb.get("/v1/types")


Out[2]:
GET http://localhost/v1/types
200 OK
[
  "datasets", 
  "functions", 
  "plugin.setups", 
  "plugin.startups", 
  "plugins", 
  "procedures"
]

In [3]:
# keyword arguments to get() are appended to the GET query string

mldb.get("/v1/types", x="y")


Out[3]:
GET http://localhost/v1/types?x=y
200 OK
[
  "datasets", 
  "functions", 
  "plugin.setups", 
  "plugin.startups", 
  "plugins", 
  "procedures"
]
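
To make the query-string behaviour above concrete, here is a minimal standard-library sketch of how a base URL, a path and keyword arguments can combine into the URL shown in Out[3]. The helper name build_get_url is hypothetical and this is not pymldb's actual implementation, only an illustration of the idea.

```python
from urllib.parse import urlencode


def build_get_url(base, path, **params):
    """Join base and path, then append params as a query string.

    Hypothetical helper for illustration; not part of pymldb.
    """
    url = base.rstrip("/") + path
    if params:
        # urlencode turns {"x": "y"} into "x=y" and escapes special characters
        url += "?" + urlencode(params)
    return url


print(build_get_url("http://localhost", "/v1/types", x="y"))
# prints: http://localhost/v1/types?x=y
```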

In [4]:
# dictionary arguments to put() and post() are sent as JSON via PUT or POST

mldb.put("/v1/datasets/sample", {"type": "sparse.mutable"} )


Out[4]:
PUT http://localhost/v1/datasets/sample
201 Created
{
  "status": {
    "columnCount": 0, 
    "rowCount": 0, 
    "valueCount": 0
  }, 
  "config": {
    "type": "sparse.mutable", 
    "id": "sample"
  }, 
  "state": "ok", 
  "type": "sparse.mutable", 
  "id": "sample"
}
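
For comparison, this is roughly what "sent as JSON via PUT" involves when done by hand with the standard library. It is a sketch only: the helper name build_json_request is hypothetical, the Content-Type header is an assumption about what a JSON REST endpoint expects, and the request is built but never sent, so no MLDB server is needed to run it.

```python
import json
from urllib.request import Request


def build_json_request(method, url, payload):
    """Build (but do not send) an HTTP request with a JSON-encoded body.

    Hypothetical helper for illustration; not part of pymldb.
    """
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},  # assumed header
    )


req = build_json_request(
    "PUT", "http://localhost/v1/datasets/sample", {"type": "sparse.mutable"}
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Having to repeat this encoding and header setup for every call is the kind of boilerplate that mldb.put() and mldb.post() hide.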

Here we create a dataset, insert two rows of two columns each, and commit it:


In [5]:
mldb.put( "/v1/datasets/demo",      {"type":"sparse.mutable"})
mldb.post("/v1/datasets/demo/rows", {"rowName": "first", "columns":[["a",1,0],["b",2,0]]})
mldb.post("/v1/datasets/demo/rows", {"rowName": "second", "columns":[["a",3,0],["b",4,0]]})
mldb.post("/v1/datasets/demo/commit")


Out[5]:
POST http://localhost/v1/datasets/demo/commit
200 OK
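
The row payloads above repeat the same shape: a rowName plus a list of three-element column entries. A small helper can build them from a plain dictionary. This is a sketch under assumptions: make_row is a hypothetical name, and treating the third element of each entry as a timestamp is an interpretation not stated in this tutorial (the cells above simply pass 0).

```python
def make_row(row_name, values, ts=0):
    """Build a row payload in the shape used by the POST calls above.

    Hypothetical helper; `ts` fills the third slot of each column entry,
    assumed here to be a timestamp.
    """
    return {
        "rowName": row_name,
        "columns": [[name, value, ts] for name, value in values.items()],
    }


rows = [
    make_row("first", {"a": 1, "b": 2}),
    make_row("second", {"a": 3, "b": 4}),
]
for row in rows:
    print(row)
```

With a live connection, each payload could then be passed to mldb.post("/v1/datasets/demo/rows", row) exactly as in In [5].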

SQL Queries

Now that we have a dataset, we can use the query() method on the connection to run an SQL query and get the results back as a Pandas DataFrame:


In [6]:
df = mldb.query("select * from demo")
print(type(df))


<class 'pandas.core.frame.DataFrame'>

In [7]:
df


Out[7]:
          a  b
_rowName
second    3  4
first     1  2

Where to next?

Check out the other Tutorials and Demos.

